Efficient Learning in Heterogeneous Internet of Things Ecosystems
The Internet of Things (IoT) is a growing network of heterogeneous devices, combining various sensing and computing nodes at different scales, which creates a large volume of data. Many IoT applications use machine learning (ML) algorithms to analyze the data. The high computational complexity of ML workloads poses significant challenges to IoT computing platforms, which tend to be less powerful, resource-constrained devices. Transmitting such large volumes of data to the cloud also has various issues such as scalability, security, and privacy. In this dissertation, we propose efficient solutions to perform ML tasks while decreasing power consumption and improving performance. We first leverage the heterogeneous and interconnected nature of IoT systems, where IoT applications run on many different architectures (e.g., an x86 server or an ARM-based edge device) while communicating with each other. We present a cross-platform power and performance prediction technique for intelligent task allocation. The proposed technique estimates time-variant energy consumption with only 7% error across completely different architectures, enabling intelligent task allocation that reduces energy consumption by 16.5% for state-of-the-art ML workloads. We next show how to further advance the learning procedures toward real-time and online processing by distributing learning tasks onto the hierarchy of IoT devices. Our solution leverages brain-inspired high-dimensional (HD) computing to derive a new class of learning algorithms that can easily run on IoT devices while providing accuracy comparable to the state of the art. We show that HD-based learning algorithms can cover various real-world problems, from conventional classification to cognitive tasks beyond classical ML, such as DNA pattern matching. We demonstrate that HD-based learning can enable secure, collaborative learning by efficiently distributing a large volume of learning tasks onto heterogeneous computing nodes. We have implemented the proposed learning solution on various platforms while offering superior computing efficiency. For example, our solution achieves 486× and 7× performance improvements for the training and inference phases, respectively, on a low-power ARM processor, as compared to state-of-the-art deep learning solutions.
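The HD-based learning the dissertation describes follows a common flow: encode each input into a hypervector, bundle hypervectors per class during training, and classify by similarity search. The following is a minimal sketch of that flow, assuming a random-projection encoder; all names and design choices here are illustrative, not the dissertation's exact implementation.

```python
import numpy as np

D = 10_000  # hypervector dimensionality (typical HD scale)

def encode(x, proj):
    """Map a feature vector into bipolar high-dimensional space
    via a fixed random projection (one common HD encoder)."""
    return np.sign(proj @ x)

def train(X, y, num_classes, rng=np.random.default_rng(0)):
    proj = rng.standard_normal((D, X.shape[1]))
    classes = np.zeros((num_classes, D))
    for x, label in zip(X, y):
        classes[label] += encode(x, proj)  # bundling = elementwise add
    return proj, classes

def predict(x, proj, classes):
    h = encode(x, proj)
    # cosine similarity against each class hypervector
    sims = classes @ h / (np.linalg.norm(classes, axis=1) * np.linalg.norm(h) + 1e-9)
    return int(np.argmax(sims))
```

Training and inference reduce to additions and dot products, which is why this style of model runs well on low-power processors.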
QHD: A brain-inspired hyperdimensional reinforcement learning algorithm
Reinforcement Learning (RL) has opened up new opportunities to solve a wide range of complex decision-making tasks. However, modern RL algorithms, e.g., Deep Q-Learning, are based on deep neural networks, which incur high computational costs when running on edge devices. In this paper, we propose QHD, a hyperdimensional reinforcement learning algorithm that mimics brain properties to enable robust and real-time learning. QHD relies on a lightweight brain-inspired model to learn an optimal policy in an unknown environment. We first develop a novel mathematical foundation and encoding module that maps the state-action space into high-dimensional space. We then develop a hyperdimensional regression model to approximate the Q-value function. The QHD-powered agent makes decisions by comparing the Q-values of each possible action. We evaluate the effect of different RL training batch sizes and local memory capacities on the quality of learning. QHD is also capable of online learning with a tiny local memory, which can be as small as the training batch size, and provides real-time learning by further decreasing the memory capacity and the batch size. This makes QHD suitable for highly efficient reinforcement learning in edge environments, where supporting online and real-time learning is crucial. Our solution also supports a small experience replay batch size that provides a 12.3× speedup compared to DQN while ensuring minimal quality loss. Our evaluation shows QHD's capability for real-time learning, providing a 34.6× speedup and significantly better quality of learning than state-of-the-art deep RL algorithms.
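As a rough illustration of the approach the abstract describes, the sketch below approximates Q-values with a hyperdimensional regression model: state-action pairs are encoded into hypervectors, and a model hypervector is nudged toward the temporal-difference target with a delta rule. The per-action projection encoder, hyperparameters, and update rule are assumptions for illustration, not QHD's exact formulation.

```python
import numpy as np

D = 10_000
rng = np.random.default_rng(0)

class HDQ:
    """Q-value regression in high-dimensional space (illustrative sketch)."""
    def __init__(self, state_dim, n_actions, lr=0.1, gamma=0.99):
        # one random projection per action keeps the sketch simple;
        # the paper's encoder maps the joint state-action space.
        self.proj = rng.standard_normal((n_actions, D, state_dim))
        self.model = np.zeros(D)
        self.lr, self.gamma, self.n_actions = lr, gamma, n_actions

    def encode(self, s, a):
        return np.cos(self.proj[a] @ s)  # nonlinear HD encoding

    def q(self, s, a):
        return self.model @ self.encode(s, a) / D

    def update(self, s, a, r, s_next, done):
        target = r if done else r + self.gamma * max(
            self.q(s_next, b) for b in range(self.n_actions))
        h = self.encode(s, a)
        # delta rule: move the model hypervector toward the TD target
        self.model += self.lr * (target - self.model @ h / D) * h

# usage: agent = HDQ(state_dim=4, n_actions=2); agent.update(s, a, r, s2, False)
```

Because the model is a single hypervector updated by vector additions, each learning step is cheap enough for online, per-sample updates on an edge device.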
CascadeHD: Efficient Many-Class Learning Framework Using Hyperdimensional Computing
Brain-inspired hyperdimensional computing (HDC) is gaining attention as a lightweight and extremely parallelizable learning alternative to deep neural networks. Prior research shows the effectiveness of HDC-based learning on less powerful systems such as edge computing devices. However, the many-class classification problem is beyond the focus of mainstream HDC research; existing HDC does not provide sufficient quality and efficiency due to its coarse-grained training. In this paper, we propose an efficient many-class learning framework, called CascadeHD, which identifies latent high-dimensional patterns across many classes holistically while learning a hierarchical inference structure using a novel meta-learning algorithm for high efficiency. Our evaluation conducted on the NVIDIA Jetson device family shows that CascadeHD improves accuracy for many-class classification by up to 18% while achieving a 32% speedup compared to existing HDC. © 2021 IEEE
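The hierarchical inference structure can be pictured as a cascade of similarity searches: a coarse HD model first routes a query hypervector to a group of classes, and a group-specific model then disambiguates within that group. The two-stage sketch below is an assumption about the general shape of such a cascade; CascadeHD's class-grouping and meta-learning procedures are not reproduced here.

```python
import numpy as np

def cascade_predict(h, coarse_model, fine_models):
    """Two-stage HD inference (illustrative).

    coarse_model: (num_groups, D) array, one hypervector per class group
    fine_models:  list of (class_hypervectors, class_labels) per group
    """
    g = int(np.argmax(coarse_model @ h))      # stage 1: pick a class group
    fine_classes, labels = fine_models[g]
    k = int(np.argmax(fine_classes @ h))      # stage 2: pick a class in it
    return labels[k]
```

Each query is compared against a few group hypervectors plus one small group model rather than against every class, which is where the efficiency gain over flat many-class HDC comes from.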
A Framework for Efficient and Binary Clustering in High-Dimensional Space
Today's applications generate large amounts of data, the majority of which is not associated with any labels. Clustering methods are the most commonly used algorithms for data analysis, especially in healthcare. However, running clustering algorithms on embedded devices is significantly slow, as the computation involves a large number of complex pairwise similarity measurements. In this paper, we propose FebHD, an adaptive framework for efficient and fully binary clustering in high-dimensional space. Instead of using complex similarity metrics, e.g., Euclidean distance, FebHD introduces a nonlinear encoder that maps data points into sparse high-dimensional space. The FebHD encoder simplifies the similarity search, the most costly and frequent clustering operation, to Hamming distance, which can be accelerated on today's hardware. FebHD performs clustering by assigning each data point to a set of initialized centers. It then updates the centers adaptively based on (i) the data points assigned to each cluster and (ii) the confidence of the model in its clustering predictions. This adaptive update enables FebHD to provide high-quality clustering with very few learning iterations. We also propose an end-to-end hardware accelerator that parallelizes the entire FebHD computation by exploiting FPGA bit-level granularity. Our evaluation shows that FebHD provides accuracy comparable to state-of-the-art clustering algorithms while running 6.2× faster with 9.1× higher energy efficiency on the same FPGA platform (4.7× faster and 5.8× more energy efficient on the same GPU). © 2021 EDAA
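A minimal sketch of fully binary HD clustering in this spirit: encode points into binary hypervectors, assign each point to its nearest center by Hamming distance (XOR plus popcount), and rebinarize centers by a per-bit majority vote. The cosine-threshold encoder and the plain majority update below are assumptions; FebHD's confidence-based adaptive update is omitted.

```python
import numpy as np

D = 10_000
rng = np.random.default_rng(0)

def binary_encode(X, proj):
    """Nonlinear binary HD encoding (one plausible choice)."""
    return (np.cos(X @ proj.T) > 0).astype(np.uint8)

def hamming(h, centers):
    # XOR + popcount; bit-level parallel on FPGA-style hardware
    return (h ^ centers).sum(axis=1)

def cluster(X, k, iters=10):
    proj = rng.standard_normal((D, X.shape[1]))
    H = binary_encode(X, proj)
    centers = H[rng.choice(len(H), k, replace=False)]
    for _ in range(iters):
        assign = np.array([np.argmin(hamming(h, centers)) for h in H])
        for c in range(k):
            members = H[assign == c]
            if len(members):
                # per-bit majority vote rebinarizes the center
                centers[c] = (members.mean(axis=0) > 0.5).astype(np.uint8)
    return assign, centers
```

Replacing floating-point distance computations with XOR and popcount is what makes the similarity search amenable to bit-level hardware acceleration.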
Algorithm-Hardware Co-Design for Efficient Brain-Inspired Hyperdimensional Learning on Edge
Machine learning methods have been widely utilized to provide high quality results for many cognitive tasks. Running sophisticated learning tasks requires high computational cost to process large amounts of learning data. Brain-inspired Hyperdimensional Computing (HDC) has been introduced as an alternative solution for lightweight learning on edge devices. However, HDC models still rely on accelerators to ensure real-time and efficient learning. These hardware designs are not commercially available and require a relatively long synthesis and fabrication cycle whenever new applications emerge. In this paper, we propose an efficient framework for accelerating HDC at the edge by fully utilizing the available computing power. We optimize HDC through algorithm-hardware co-design of the host CPU and existing low-power machine learning accelerators, such as the Edge TPU. We interpret the lightweight HDC learning model as a hyper-wide neural network to take advantage of the accelerator and its machine learning platform. We further improve the runtime cost of training by employing a bootstrap aggregating algorithm, called bagging, while maintaining the learning quality. We evaluate the performance of the proposed framework with several applications. Joint experiments on a mobile CPU and the Edge TPU show that our framework achieves 4.5× faster training and 4.2× faster inference compared to the baseline platform. In addition, our framework achieves 19.4× faster training and 8.9× faster inference compared to an embedded ARM CPU (Raspberry Pi) with a similar power budget. © 2022 EDAA
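The key reinterpretation, casting the HD model as a hyper-wide neural network, can be illustrated in a few lines: the encoder becomes a single frozen, very wide dense layer (a matrix multiply plus a nonlinearity), which is exactly the kind of operation NN accelerators execute well, and bagging trains several HD models on bootstrap samples whose votes are aggregated at inference. Everything below is an illustrative assumption, not the paper's implementation.

```python
import numpy as np

D = 10_000                      # hypervector dimensionality
rng = np.random.default_rng(0)

# HD encoding viewed as one frozen, hyper-wide dense layer:
#   h = activation(W @ x), with W fixed random weights.
W = rng.standard_normal((D, 64)).astype(np.float32)

def encode(x):
    return np.tanh(W @ x)       # a wide matmul + nonlinearity

def bagged_train(X, y, num_classes, n_models=4):
    """Bootstrap aggregating: each HD model trains on a resampled subset."""
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), len(X) // 2, replace=True)
        m = np.zeros((num_classes, D), dtype=np.float32)
        for i in idx:
            m[y[i]] += encode(X[i])   # bundle encoded samples per class
        models.append(m)
    return models

def bagged_predict(x, models):
    h = encode(x)
    scores = sum(m @ h for m in models)  # aggregate similarity votes
    return int(np.argmax(scores))
```

Because each bagged model sees only a subsample, training cost per model drops, while vote aggregation preserves learning quality, the trade-off the abstract describes.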